SDA 3.5 Documentation for CASEStoDDL

NAME

CASEStoDDL - Create a DDL file from CASES (version 4)

DESCRIPTION

SDA programs can be used to document data collected by the CASES system for computer-assisted interviewing. To use the SDA programs, you must first generate a DDL file that describes a specific data file produced by the CASES system. This document summarizes the procedures necessary to create such a DDL file.

There are many options available to customize the content of a DDL file produced from a CASES instrument. However, there are default options which will usually produce satisfactory results, at least as a starting point. The purpose of this document is to illustrate how to run the procedures in a simple way, taking advantage of the default options.

STEPS OF THE PROCESS

1. Create a list of variables to output
2. Create a list of cases to include in the data file
3. Create the data file by running the CASES ‘output’ program
4. Run the SDA ‘q4toddl’ program
5. Add study-level information to the DDL file
6. Check the DDL file

1. CREATE A LIST OF VARIABLES TO OUTPUT

Create a list of variables that you want to include in your data file. The list of variables should have one variable name on each line. Note that there are usually many variables used by the interviewing system that you will not want to pass through to the data file.

One way to obtain a list of variables is to run the CASES ‘layout’ program with the ‘-b’ (brief) and ‘-o’ (only variables) options.

The ‘-b’ option requests ‘brief’ output, with item names alone, and without other information on each item.
The ‘-o’ option requests that only items with a defined width be listed. This means that the ‘nodata’ items are omitted.

The list produced by ‘layout’ can then be edited down to the variables of substantive interest by deleting fills and other non-input items.

Save the final list of variables in a file named something like ‘myvars’, for input into the third stage of the process.
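
The editing step can also be scripted. The following is a minimal sketch in Python, assuming the ‘layout’ output has been saved to a file with one item name per line. The function names and the idea of an exclusion set are illustrative assumptions and are not part of CASES or SDA.

```python
from pathlib import Path


def filter_variables(names, exclude):
    """Return names in their original order, dropping blanks, duplicates, and excluded items."""
    excluded = set(exclude)
    seen = set()
    kept = []
    for raw in names:
        name = raw.strip()
        if not name or name in excluded or name in seen:
            continue
        seen.add(name)
        kept.append(name)
    return kept


def write_variable_list(source, target, exclude):
    """Read item names from source, filter them, and write one name per line to target."""
    names = Path(source).read_text().splitlines()
    kept = filter_variables(names, exclude)
    Path(target).write_text("\n".join(kept) + "\n")
    return kept
```

Here `write_variable_list` would be called with the saved ‘layout’ output as `source`, ‘myvars’ as `target`, and the names of the fills and other unwanted items as `exclude`. Matching is exact and case-sensitive, which may or may not suit a given instrument.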

2. CREATE A LIST OF CASES TO INCLUDE IN THE DATA FILE

You will usually want to include only the completed cases in the data file. In order to do this, you must prepare a list of the cases to be output by the CASES system.

The CASES ‘caselist’ program produces lists of cases, according to criteria that you specify. The precise criteria to use depend on your treatment of completed cases -- whether they have been run through the second-stage cleaning process, for example. Typically, you would specify that the ‘caselist’ program produce a list of all cases that are in one of the following stages: in ‘middle’ or in ‘ready’ or in ‘certified’.

Save that list of cases in a file named something like ‘idlist’, for input into the next stage of the process.

3. CREATE THE DATA FILE BY RUNNING THE CASES OUTPUT PROGRAM

To create the data file that will be documented, run the CASES ‘output’ program, using the ‘-i=filename’ option. (Do NOT run the CASES ‘output’ program without the ‘-i’ option. See below for an explanation of why not.) The ‘-i’ option specifies the name of a file containing a list of the variables that you want to include in the data file. This is the file you produced in step 1 above.

For example, if the file ‘myvars’ contains a list of variables for CASES to output, and if the file ‘idlist’ contains a list of the case IDs to be output, you could use the following command:

output -i=myvars -ou=mydata idlist

In this example, the ‘output’ program would generate two files:

An ASCII data file named ‘mydata’.
A layout file named ‘myvars.lay’ that gives the locations of each variable in the file ‘mydata’.

If you do NOT use the ‘-i’ option, the ‘output’ program will produce a large data file with many variables you probably do not want to include in a dataset for analysis. Also, you will not get a layout file for the variables you want -- rather, you will have to rely on the comprehensive layout produced by the CASES ‘layout’ program. That layout refers to variables from the ZERO record as being located in record 0, which will cause problems if you try to pass those locations on to other programs.
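
If this step is run from a script, the ‘-i’ requirement can be enforced in code. The sketch below is one way to do that in Python. It assumes the ‘output’ program is on the search path and signals failure with a nonzero exit status, neither of which is confirmed by this document. The function names are hypothetical.

```python
import subprocess


def build_output_command(varlist, datafile, idlist):
    """Build the argument list for the CASES 'output' program, always including -i."""
    if not varlist:
        raise ValueError("a variable list is required: do not run 'output' without -i")
    return ["output", f"-i={varlist}", f"-ou={datafile}", idlist]


def run_output(varlist, datafile, idlist):
    """Run 'output' and return the names of the data file and the layout file."""
    command = build_output_command(varlist, datafile, idlist)
    # Assumption: a nonzero exit status means the run failed.
    subprocess.run(command, check=True)
    # Assumption: the layout file is always the variable-list name plus '.lay',
    # generalized from the 'myvars' -> 'myvars.lay' example above.
    return datafile, f"{varlist}.lay"
```

Calling `build_output_command("myvars", "mydata", "idlist")` yields the same arguments as the example command above.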

4. RUN THE SDA ‘Q4TODDL’ PROGRAM TO MAKE A DDL FILE

The Q4TODDL program gathers information both from the CASES instrument and from the layout file, and then it puts the pieces together in the form of a DDL file. The text of questions and the category labels are taken from the CASES instrument. The location of each variable in the data file is taken from the layout file.

The process includes the following steps:

Change to the directory containing the instrument files, using a command like this:
```
cd \xstudy\e-inst
```
Create a file of commands (here named ‘q4toddl.txt’).
With any text editor that creates a plain ASCII file, create a file containing commands that specify the desired options for Q4TODDL. Any name will do for the file; in this example we will call the command file ‘q4toddl.txt’. An example that covers most situations is the following:
```
#_____(Command file for Q4TODDL -- named ‘q4toddl.txt’)__________

# Note that lines beginning with ‘#’ are interpreted as comments.
# Begin the commands in column 1 of your file.

#
# The first command is a list of .q or .m files used by CASES
QFiles = file1.q file2.q
Varlist = myvars
Layout = myvars.lay
Output = myddl.txt


#__________(End of command file)_________________________________
```
The above command file will work as long as the list of variables you want is in ‘myvars’, your layout file is ‘myvars.lay’, and you run the Q4TODDL program in the same directory as the ‘.q’ files (or the macro-expanded ‘.m’ files). (For more options and explanations, see the full Q4TODDL document.)
The DDL output will be written to the file ‘myddl.txt’, which was the name specified in the command file.
Run the Q4TODDL program, giving the name of the command file after the ‘-b’ flag:
```
q4toddl -b q4toddl.txt
```
This command will create a DDL file (named ‘myddl.txt’ in the command file).
Examine the diagnostic output.
Diagnostic and error messages are appended to the file ‘Q4TODDL.MSG’. It is always a good idea to take a look at that file.
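
When the same procedure is repeated for several studies, the command file can be generated rather than typed. The following Python sketch writes only the four commands shown in the example above (QFiles, Varlist, Layout, Output). The function name is hypothetical, and other Q4TODDL options are not covered.

```python
def render_command_file(qfiles, varlist, layout, output):
    """Return the text of a Q4TODDL command file using the four commands shown above."""
    if not qfiles:
        raise ValueError("at least one .q or .m file is required")
    lines = [
        "# Command file for Q4TODDL (generated)",
        "QFiles = " + " ".join(qfiles),
        f"Varlist = {varlist}",
        f"Layout = {layout}",
        f"Output = {output}",
    ]
    # Every command begins in column 1, as the example command file requires.
    return "\n".join(lines) + "\n"
```

The returned text can be written to ‘q4toddl.txt’ in the instrument directory before running Q4TODDL with the ‘-b’ flag as described above.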

5. ADD STUDY-LEVEL INFORMATION TO THE DDL FILE

Information about the dataset as a whole is not contained either in the CASES Q-language file or in the layout file. You can edit that information manually into the DDL file.

The main required elements are:

A study title:
title= ...
The number of characters in each data record or line:
reclen = xxx
If the data file contains more than one record per case, you must indicate that as well:
records/case = n
The first variable definition after the study-level information MUST be a variable named ‘CASEID’. You can edit the location and description of a suitable variable into the blank field produced by Q4TODDL.

For a complete description of the required format of a DDL file, see the DDL document.
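
The value for ‘reclen’ can be read off the data file itself. The Python sketch below counts the characters on each line of the data file, excluding the line terminator, and reports a single length only if every line agrees. It assumes the file is fixed-width with one record per line. Whether ‘reclen’ should include the line terminator is not stated here, so check the DDL document before using the number. The function names are hypothetical.

```python
from collections import Counter


def record_lengths(path):
    """Count how many lines of the data file have each length, excluding line endings."""
    counts = Counter()
    # latin-1 maps every byte to one character, so lengths equal byte counts.
    with open(path, "r", encoding="latin-1", newline="") as handle:
        for line in handle:
            counts[len(line.rstrip("\r\n"))] += 1
    return counts


def single_record_length(path):
    """Return the common line length, or raise ValueError if lines differ or the file is empty."""
    counts = record_lengths(path)
    if len(counts) != 1:
        raise ValueError(f"expected one line length, found: {dict(counts)}")
    return next(iter(counts))
```

A `ValueError` from `single_record_length` suggests that the data file is not what the layout file describes, which is worth resolving before editing the DDL file.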

6. CHECK THE DDL FILE

After you have run Q4TODDL, added the required study-level information, and made any other changes you want, you can check the resulting DDL file for syntax errors. The MAKESDA program will do this for you, if you use the ‘-c’ option.

For a DDL file named ‘myddl.txt’, you would give the following command:

makesda -c -l myddl.txt

Some messages will appear on the screen. A fuller report will be appended to the file ‘MAKESDA.MSG’. Also, note that a list of all variables processed will be put into the file ‘MAKESDA.LST’.

Once you have a DDL file without errors, you can proceed to create an SDA dataset and generate a codebook.
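
Because reports are appended to ‘MAKESDA.MSG’, the file can hold messages from earlier runs. The Python sketch below shows one way to see only the messages from the current check: note the size of the message file, run the check, then read from that position onward. It assumes ‘makesda’ is on the search path and writes ‘MAKESDA.MSG’ in the current directory. The function names are hypothetical.

```python
import os
import subprocess


def file_size(path):
    """Return the current size of path in bytes, or 0 if it does not exist yet."""
    return os.path.getsize(path) if os.path.exists(path) else 0


def read_appended(path, offset):
    """Return the text written to path after byte position offset."""
    if not os.path.exists(path):
        return ""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read().decode("latin-1")


def check_ddl(ddl_file, message_file="MAKESDA.MSG"):
    """Run 'makesda -c -l' on ddl_file and return only the messages from this run."""
    start = file_size(message_file)
    # check=False: no assumption is made about the exit status of makesda.
    subprocess.run(["makesda", "-c", "-l", ddl_file], check=False)
    return read_appended(message_file, start)
```

The same two helper functions would work for ‘Q4TODDL.MSG’, which is also appended to rather than overwritten.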

SEE ALSO

DDL - Data Description Language
makesda - Make an SDA dataset from a DDL file and a data file
q4toddl - Convert CASES Q-language files to DDL
xcodebk - Produce a codebook